Skip to content

fix(server): bind SSE replay to issuing transport instance in WebStandardStreamableHTTPServerTransport#2068

Closed
felixweinberger wants to merge 3 commits into
mainfrom
fweinberger/f5-shttp-scope
Closed

fix(server): bind SSE replay to issuing transport instance in WebStandardStreamableHTTPServerTransport#2068
felixweinberger wants to merge 3 commits into
mainfrom
fweinberger/f5-shttp-scope

Conversation

@felixweinberger
Copy link
Copy Markdown
Contributor


Targeted fix in packages/server/src/server/streamableHttp.ts: response routing uses per-request closures instead of the shared _streamMapping lookups where possible, and event replay verifies the Last-Event-ID belongs to the requesting session before replaying.

Motivation and Context

The existing transport class shares the same Last-Event-ID replay shape that R3's shttpHandler fixes; this PR ports that fix to the existing transport. The per-request closure change reduces the surface for cross-request routing mistakes.

How Has This Been Tested?

Existing streamableHttp.test.ts + new replay-ownership test. Conformance server 40/40.

Breaking Changes

None.

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to change)
  • Documentation update

Checklist

  • I have read the MCP Documentation
  • My code follows the repository's style guidelines
  • New and existing tests pass locally
  • I have added appropriate error handling
  • I have added or updated documentation as needed

Additional context

On main (independent of the decomposition stack). Review guide: https://gist.github.com/felixweinberger/5a48e0f14d5aced39ed6a91b61940711.


…dardStreamableHTTPServerTransport

Per-instance _standaloneSseStreamId (lazy, prefers sessionId) and _issuedStreamIds set; replayEventsAfter rejects event IDs from a different instance. Same shape as the new shttpHandler path.
@felixweinberger felixweinberger added the v2-stateless 2026-06 SDK: Protocol decomposition + SEP alignment (request-first / stateless) label May 12, 2026
@changeset-bot
Copy link
Copy Markdown

changeset-bot Bot commented May 12, 2026

⚠️ No Changeset found

Latest commit: 6024430

Merging this PR will not cause a version bump for any packages. If these changes should not result in a new version, you're good to go. If these changes should result in a version bump, you need to add a changeset.

This PR includes no changesets

When changesets are added to this PR, you'll see the packages that this PR includes changesets for and the associated semver types

Click here to learn what changesets are, and how to add one.

Click here if you're a maintainer who wants to add a changeset to this PR

@pkg-pr-new
Copy link
Copy Markdown

pkg-pr-new Bot commented May 12, 2026

Open in StackBlitz

@modelcontextprotocol/client

npm i https://pkg.pr.new/@modelcontextprotocol/client@2068

@modelcontextprotocol/server

npm i https://pkg.pr.new/@modelcontextprotocol/server@2068

@modelcontextprotocol/express

npm i https://pkg.pr.new/@modelcontextprotocol/express@2068

@modelcontextprotocol/fastify

npm i https://pkg.pr.new/@modelcontextprotocol/fastify@2068

@modelcontextprotocol/hono

npm i https://pkg.pr.new/@modelcontextprotocol/hono@2068

@modelcontextprotocol/node

npm i https://pkg.pr.new/@modelcontextprotocol/node@2068

commit: 6024430

@felixweinberger
Copy link
Copy Markdown
Contributor Author

@claude review

1 similar comment
@felixweinberger
Copy link
Copy Markdown
Contributor Author

@claude review

Comment thread docs/migration.md
Comment on lines +965 to +976
### EventStore: `getStreamIdForEventId` required for SSE replay

`EventStore.getStreamIdForEventId(eventId)` is now required for `Last-Event-ID` resumption.
The transport must verify the event belongs to a stream the requesting session issued
*before* replaying; without this method, replay returns HTTP 403. Implement it on your
event store (typically a lookup from event id to the `streamId` you stored it under).

```typescript
async getStreamIdForEventId(eventId: string): Promise<string | undefined> {
return this.events.get(eventId)?.streamId;
}
```
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟡 This new ### EventStore: getStreamIdForEventId required for SSE replay subsection is inserted directly under ## Need Help?, between that heading and its body ("If you encounter issues during migration: 1. Check the FAQ…"), so it renders as a child of Need Help? and splits that section in two — it should live under ## Breaking Changes alongside the other transport/server entries. Also, per CLAUDE.md ("document them in both: docs/migration.md … docs/migration-SKILL.md") and the repo review checklist, the companion docs/migration-SKILL.md needs a matching entry — it currently has nothing for getStreamIdForEventId/EventStore.

Extended reasoning...

What's wrong

There are two documentation issues with the new migration note added in this hunk.

1. Section is misplaced inside ## Need Help?

The diff inserts the new h3 at line 965, immediately after ## Need Help? (line 963) and before that section's own body ("If you encounter issues during migration: 1. Check the FAQ…", line 978). The rendered structure becomes:

## Need Help?
  ### EventStore: getStreamIdForEventId required for SSE replay   ← new note
  If you encounter issues during migration:                       ← original Need-Help body
    1. Check the FAQ …

So the breaking-change note is now a subsection of Need Help?, and the Need Help? heading is separated from its own list by an unrelated 12-line block. The content describes a behavioral requirement change (replay now fail-closes with HTTP 403 unless getStreamIdForEventId is implemented) — that belongs under ## Breaking Changes, not under the help/FAQ footer. CLAUDE.md:32 also says: "Search for related sections and group related changes together rather than adding new standalone sections."

2. docs/migration-SKILL.md not updated

CLAUDE.md:27–30 explicitly requires breaking changes to be documented in both files:

When making breaking changes, document them in both:

  • docs/migration.md — human-readable guide with before/after code examples
  • docs/migration-SKILL.md — LLM-optimized mapping tables for mechanical migration

The repo review checklist echoes this ("Breaking changes documented in docs/migration.md and docs/migration-SKILL.md"). The PR adds the note only to docs/migration.md; grep -E 'getStreamIdForEventId|EventStore' docs/migration-SKILL.md returns nothing (file is 539 lines and has zero matches). Since the PR author themselves treats this as migration-relevant ("is now required for Last-Event-ID resumption … without this method, replay returns HTTP 403"), the SKILL file should get a corresponding mapping entry.

Step-by-step proof

  1. docs/migration.md:963## Need Help?
  2. docs/migration.md:965new ### EventStore: getStreamIdForEventId required for SSE replay (added by this PR)
  3. docs/migration.md:967–976 — body + code block of the new note
  4. docs/migration.md:978 — "If you encounter issues during migration:" — the original first line of the Need Help? body, now pushed below the new h3.

Markdown headings nest by level, so the h3 at step 2 becomes a child of the h2 at step 1, and the FAQ list at step 4 renders as trailing content of the EventStore subsection rather than as the body of Need Help?.

For the second point: docs/migration-SKILL.md exists, is referenced from migration.md ("An LLM-optimized version of this guide is available at docs/migration-SKILL.md"), and contains no entry for this change.

Why nothing prevents it

There is no doc lint enforcing heading placement or migration.md ↔ migration-SKILL.md parity, so this only surfaces in review.

Impact

  • Readers scanning ## Breaking Changes will miss the new EventStore requirement entirely (it's hidden at the very bottom under Need Help?).
  • The Need Help? section reads oddly — heading, then an unrelated EventStore note, then "If you encounter issues during migration: …".
  • LLM-driven migrations using migration-SKILL.md (its stated purpose) won't know to add getStreamIdForEventId, leaving migrated EventStores returning 403 on replay.

Fix

  • Move the ### EventStore: getStreamIdForEventId required for SSE replay block up under ## Breaking Changes (e.g. near the StreamableHTTPServerTransport / server-transport entries), leaving ## Need Help? immediately followed by "If you encounter issues during migration: …".
  • Add a corresponding entry to docs/migration-SKILL.md (e.g. a row mapping EventStore → "implement getStreamIdForEventId(eventId) — required for Last-Event-ID replay; without it the transport returns 403").

Comment on lines +875 to +885
private _addIssuedStreamId(streamId: string): void {
if (this._issuedStreamIds.size >= WebStandardStreamableHTTPServerTransport._MAX_ISSUED_STREAM_IDS) {
for (const id of this._issuedStreamIds) {
if (id !== this._standaloneSseStreamId) {
this._issuedStreamIds.delete(id);
break;
}
}
}
this._issuedStreamIds.add(streamId);
}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟡 When _addIssuedStreamId is called with an id that's already in the set (e.g. the standalone GET stream id on a reconnect without Last-Event-ID) and the set is at the 256 cap, the eviction branch still runs and drops the oldest POST stream id even though the subsequent .add() is a no-op. A later Last-Event-ID resume on that evicted POST stream then gets a spurious 403. Trivial fix: early-return at the top with if (this._issuedStreamIds.has(streamId)) return;.

Extended reasoning...

What the bug is

_addIssuedStreamId (streamableHttp.ts:875–885) performs its FIFO eviction check before adding the new id, and does not check whether the incoming streamId is already present in the set:

private _addIssuedStreamId(streamId: string): void {
    if (this._issuedStreamIds.size >= WebStandardStreamableHTTPServerTransport._MAX_ISSUED_STREAM_IDS) {
        for (const id of this._issuedStreamIds) {
            if (id !== this._standaloneSseStreamId) {
                this._issuedStreamIds.delete(id);
                break;
            }
        }
    }
    this._issuedStreamIds.add(streamId);
}

If streamId is already a member, .add(streamId) is a no-op (Set semantics), so the set didn't actually need a slot freed — yet the eviction branch already dropped the oldest non-standalone POST stream id. Net effect: one valid POST stream id is evicted for nothing, and the set ends up at size = cap - 1 instead of cap.

Code path that triggers it

handleGetRequest calls this._addIssuedStreamId(this._standaloneSseStreamId) on every fresh standalone GET (line ~480). The 409 conflict guard above it only checks this._streamMapping.get(this._standaloneSseStreamId) — i.e. whether a standalone stream is currently active. When a prior GET stream is cancelled or closed, its cleanup() / cancel callback removes the entry from _streamMapping but the standalone id is intentionally kept in _issuedStreamIds (so replay ownership survives the source stream's close, per the field's doc comment). So a subsequent GET without Last-Event-ID passes the 409 check and reaches _addIssuedStreamId(standaloneId) again with that id already present.

Why existing code doesn't prevent it

  • The 409 guard is keyed on _streamMapping, not _issuedStreamIds, so it doesn't block re-adds of the standalone id.
  • Reconnects with Last-Event-ID go to replayEvents and skip this path, but reconnects without Last-Event-ID (e.g. a client that just reopens the GET stream) do hit it.
  • The eviction loop deliberately skips _standaloneSseStreamId, so the victim is always a POST stream id.

Step-by-step proof

  1. Session has accumulated 255 POST stream ids in _issuedStreamIds, plus the standalone GET id → size === 256 (the cap).
  2. Client's standalone GET stream disconnects → cancel deletes the standalone entry from _streamMapping; _issuedStreamIds is unchanged (still 256 entries, standalone id still present).
  3. Client reconnects with a fresh GET (no Last-Event-ID). 409 check passes since _streamMapping has no standalone entry.
  4. _addIssuedStreamId(standaloneId) runs: size (256) >= 256 → eviction loop deletes the oldest POST id, say postId_0. Then .add(standaloneId) is a no-op since it's already in the set. Set is now 255 entries.
  5. Client later sends GET with Last-Event-ID referencing an event from postId_0. replayEvents resolves streamId = postId_0, then !this._issuedStreamIds.has(postId_0) → returns 403 "Forbidden: event ID does not belong to this session", even though this transport instance did mint that stream.

Impact

Low. Triggering requires (a) ≥256 stream ids issued in a single session, (b) a standalone GET reconnect-without-Last-Event-ID while exactly at the cap. The evicted id is the oldest, which would have been evicted on the very next POST anyway under the FIFO bound — this just shifts the eviction one slot earlier and momentarily wastes a slot. But it's a demonstrable correctness flaw in newly-added code with a one-line fix, so worth tightening.

Fix

Add an early return at the top of _addIssuedStreamId:

private _addIssuedStreamId(streamId: string): void {
    if (this._issuedStreamIds.has(streamId)) return;
    if (this._issuedStreamIds.size >= WebStandardStreamableHTTPServerTransport._MAX_ISSUED_STREAM_IDS) {
        ...
    }
    this._issuedStreamIds.add(streamId);
}

(Equivalently, only run the eviction branch when !this._issuedStreamIds.has(streamId).)

Comment on lines +517 to +531
if (!this._eventStore.getStreamIdForEventId) {
return this.createJsonErrorResponse(
403,
-32_000,
'Forbidden: event store does not support session-scoped replay (getStreamIdForEventId required)'
);
}
const streamId = await this._eventStore.getStreamIdForEventId(lastEventId);
if (!streamId) {
this.onerror?.(new Error('Invalid event ID format'));
return this.createJsonErrorResponse(404, -32_001, 'Event not found');
}
if (!this._issuedStreamIds.has(streamId)) {
return this.createJsonErrorResponse(403, -32_000, 'Forbidden: event ID does not belong to this session');
}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟡 This PR is missing a changeset for @modelcontextprotocol/server even though it changes published runtime behavior — Last-Event-ID replay now returns HTTP 403 unless the EventStore implements getStreamIdForEventId (previously optional), and the standalone-GET stream id is now per-instance. Please add a changeset (the bot link suggests patch on @modelcontextprotocol/server and @modelcontextprotocol/node); given the migration.md addition explicitly says the method is "now required", you may also want to reconcile that with the PR description's "Breaking Changes: None".

Extended reasoning...

What's missing

changeset-bot reports "No Changeset found" on commit 6056de4, and ls .changeset/ confirms no entry was added for this PR. The repo actively uses changesets (60+ .changeset/*.md files, config.json, pre.json), so per convention every user-visible change to a published package needs one to surface in the changelog and trigger a version bump.

Why this PR needs one

The change to packages/server/src/server/streamableHttp.ts alters runtime behavior of the published @modelcontextprotocol/server package in three observable ways:

  1. replayEvents() now fails closed (403) when EventStore.getStreamIdForEventId is absent. Before this PR, the method was documented as "Optional: If not provided, the SDK will use the streamId returned by replayEventsAfter for stream mapping" and replay proceeded. After this PR, the same EventStore implementation gets 403 Forbidden: event store does not support session-scoped replay.
  2. Cross-instance Last-Event-ID values are rejected with 403 via the new _issuedStreamIds ownership check.
  3. The standalone SSE stream id changed from the constant '_GET_stream' to a per-instance '_GET_stream:<sessionId|uuid>', which affects any EventStore that keyed on the old constant.

All three are user-visible to anyone who has implemented a custom EventStore. The PR itself acknowledges this by adding a docs/migration.md section titled "EventStore: getStreamIdForEventId required for SSE replay" stating the method is "now required".

Step-by-step example

  1. A v2 user has a custom EventStore that implements only storeEvent and replayEventsAfter (valid before this PR — getStreamIdForEventId? is optional in the interface, and still is at the type level).
  2. They upgrade to a release containing this change.
  3. A client disconnects mid-stream and reconnects with Last-Event-ID: <id>.
  4. replayEvents() enters the new if (!this._eventStore.getStreamIdForEventId) branch (streamableHttp.ts:517-522) and returns 403 instead of replaying.
  5. Without a changeset, this behavior change never appears in the package changelog, so the user has no signal explaining why resumption broke.

Internal inconsistency

The PR description checks "Bug fix (non-breaking change)" and states "Breaking Changes: None", yet the migration guide addition says the method is "now required" and replay "returns HTTP 403" without it. Whether this is classified as a security-fix patch or a minor with a behavior note, the changeset is the right place to document it honestly so consumers see it in CHANGELOG.md.

Fix

Add a .changeset/*.md entry — at minimum:

---
"@modelcontextprotocol/server": patch
"@modelcontextprotocol/node": patch
---

Bind SSE Last-Event-ID replay to the issuing transport instance. EventStore implementations must now provide getStreamIdForEventId for replay to succeed (otherwise 403).

The changeset-bot comment already provides a one-click link to create this. This is a process/release-hygiene issue rather than a code defect, hence nit severity, but it's worth flagging because the bot warning is easy to dismiss and the behavior change is real.

@felixweinberger
Copy link
Copy Markdown
Contributor Author

@claude review

Comment thread packages/server/src/server/streamableHttp.ts
Comment thread packages/server/src/server/streamableHttp.ts
@felixweinberger
Copy link
Copy Markdown
Contributor Author

@claude review

Comment thread packages/server/src/server/streamableHttp.ts
Comment on lines +517 to 531
if (!this._eventStore.getStreamIdForEventId) {
return this.createJsonErrorResponse(
403,
-32_000,
'Forbidden: event store does not support session-scoped replay (getStreamIdForEventId required)'
);
}
const streamId = await this._eventStore.getStreamIdForEventId(lastEventId);
if (!streamId) {
this.onerror?.(new Error('Invalid event ID format'));
return this.createJsonErrorResponse(404, -32_001, 'Event not found');
}
if (!this._issuedStreamIds.has(streamId)) {
return this.createJsonErrorResponse(403, -32_000, 'Forbidden: event ID does not belong to this session');
}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟡 The two new 403 branches in replayEvents() (lines 517-522 "event store does not support session-scoped replay" and 529-530 "event ID does not belong to this session") return without calling this.onerror?.(...), whereas every other createJsonErrorResponse call site in this file — including the adjacent 400/404/500 branches and the DNS-rebinding 403s — pairs the response with a matching onerror call. Branch (1) is a server misconfiguration (now-required EventStore method missing) and branch (2) is the cross-session IDOR rejection this PR exists to add — both are signals an operator would plausibly want surfaced via the transport's onerror hook. Consider adding the matching this.onerror?.(new Error(...)) calls before each return.

Extended reasoning...

What's inconsistent

replayEvents() gained two new 403 branches in this PR:

if (!this._eventStore.getStreamIdForEventId) {
    return this.createJsonErrorResponse(403, -32_000,
        'Forbidden: event store does not support session-scoped replay (getStreamIdForEventId required)');
}
...
if (!this._issuedStreamIds.has(streamId)) {
    return this.createJsonErrorResponse(403, -32_000,
        'Forbidden: event ID does not belong to this session');
}

Neither calls this.onerror?.(...) before returning. I checked every createJsonErrorResponse call site in the file (17 others, at lines 342, 352, 425, 452, 507, 527, 596, 658, 668, 679, 694, 705, 709, 855, 903, 911, 917, 942) — all of them pair the response with this.onerror?.(new Error(<same-or-equivalent message>)) on the line above. That includes the immediately adjacent branches in replayEvents() itself (line 506 'Event store not configured', line 526 'Invalid event ID format' / 404, line 595 catch → 500), and the existing DNS-rebinding 403s at lines 341/351. So the two new 403s are the only outliers in the file, and both were introduced by this PR.

Why this isn't intentional

One could argue "client-fault rejections don't need to notify the server", but the file's existing convention says otherwise: the DNS-rebinding 403s (also security rejections of a hostile client), the 409 'Only one SSE stream is allowed', and the 404 'Session not found' all call onerror. The pre-PR code this hunk replaced (Conflict: Stream already has an active connection) also called onerror. So omitting it on the new branches reads as an oversight rather than a deliberate choice.

Why it matters (mildly)

These two branches are arguably more interesting to surface server-side than most of the others:

  • Branch (1) fires when the deployed EventStore is missing a method this PR makes mandatory — a server-side misconfiguration the operator would want in logs, since the client just sees a 403 and can't fix it.
  • Branch (2) fires when a caller presents a Last-Event-ID belonging to a different session — i.e. the cross-session IDOR probe this PR exists to block. That's exactly the kind of event an operator would want alerted on via the transport's onerror hook.

Without the onerror call, the server side gets no signal at all; only the (potentially malicious) client sees the 403.

Step-by-step proof

  1. Two transport instances A and B share one EventStore. Both have an onerror handler wired to the application's logger (the documented purpose of Transport.onerror).
  2. Session A opens a GET stream; the server sends a notification → event id evtA.
  3. An attacker holding session B's mcp-session-id sends GET /mcp with Last-Event-ID: evtA.
  4. replayEvents() on transport B: getStreamIdForEventId(evtA)'_GET_stream:<sidA>'; !this._issuedStreamIds.has('_GET_stream:<sidA>') → true.
  5. Line 530 returns a 403 to the attacker. No onerror fires — the operator's logger never sees the cross-session replay attempt. Compare with step 5 if the attacker had instead sent an unknown session id: validateSession would fire onerror?.(new Error('Session not found')) (line 916) and the probe would be logged.

Relationship to the existing line-526 comment

The existing review comment on line 528 is about the stale string in the onerror call on the 404 branch and explicitly says it "doesn't interact with any separate decision about whether the neighboring 403 branches should also call onerror". This note is that separate decision. It's not a duplicate; it's the complementary half of bringing this hunk in line with the file convention.

Addressing "is this worth a sixth comment?"

It's diagnostic-only (no protocol or correctness impact), so nit. But it's (a) a verified file-wide convention with zero other exceptions, (b) confined to lines this PR adds, (c) a one-line-per-branch fix, and (d) the two omitted cases are precisely the security/misconfig signals one would most want logged. Cheap to fix while the author is already touching the adjacent onerror string per the line-528 comment.

Fix

if (!this._eventStore.getStreamIdForEventId) {
    this.onerror?.(new Error('Forbidden: event store does not support session-scoped replay (getStreamIdForEventId required)'));
    return this.createJsonErrorResponse(403, -32_000, 'Forbidden: event store does not support session-scoped replay (getStreamIdForEventId required)');
}
...
if (!this._issuedStreamIds.has(streamId)) {
    this.onerror?.(new Error('Forbidden: event ID does not belong to this session'));
    return this.createJsonErrorResponse(403, -32_000, 'Forbidden: event ID does not belong to this session');
}

…means session terminated per spec; this branch runs for a live session)
@felixweinberger
Copy link
Copy Markdown
Contributor Author

@claude review

Comment on lines +1380 to +1384
// Required for replay ownership: see WebStandardStreamableHTTPServerTransport.replayEvents.
async getStreamIdForEventId(eventId: string): Promise<string | undefined> {
const sep = eventId.lastIndexOf('_');
return sep > 0 ? eventId.slice(0, sep) : undefined;
},
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟡 Partial migration: this PR adds a correct getStreamIdForEventId here using lastIndexOf('_'), but the adjacent replayEventsAfter in the same fixture (line 1394) still derives the streamId via lastEventId.split('_')[0], which yields '' for the new _GET_stream:<sid>_<uuid> format — startsWith('') then matches every stored event and over-replays across streams. The same split('_')[0] divergence survives in examples/server/src/inMemoryEventStore.ts (private getStreamIdFromEventId, ~L22) and test/integration/test/taskResumability.test.ts (~L35), where it instead causes silent under-replay (early-return on falsy streamId). Since the PR touches all three files and adds the correct parser right next to each broken one, consider aligning the sibling replayEventsAfter parsers (use lastIndexOf('_') or the stored map). Distinct from the resolved replayedStreamId comment, which fixed the transport's consumption side; this is about the fixtures' own event filtering.

Extended reasoning...

What's inconsistent

This PR adds getStreamIdForEventId to three in-tree EventStore implementations so the transport's new ownership check can resolve event-id → stream-id. In each case the new method parses the id correctly (via lastIndexOf('_') or a map lookup), but the sibling replayEventsAfter in the same object still derives the stream id with the old split('_')[0] parser. The two parsers now disagree for standalone-GET event ids, which this PR changes to '_GET_stream:<sid>_<uuid>' (leading underscore).

Affected sites, all touched by this PR:

  • packages/middleware/node/test/streamableHttp.test.ts:1394lastEventId.split('_')[0]''; loop filter eventId.startsWith('') is always true → over-replays every stored event (including unrelated POST-stream events). The new getStreamIdForEventId 12 lines above (1381-1384) uses lastIndexOf('_') and would return '_GET_stream:<sid>'.
  • examples/server/src/inMemoryEventStore.ts:22 — private getStreamIdFromEventId returns parts[0] = ''; replayEventsAfter then hits if (!streamId) return ''zero events replayed for a standalone-GET resume. The new map-lookup getStreamIdForEventId added directly above it returns the correct stream id.
  • test/integration/test/taskResumability.test.ts:35lastEventId.split('_')[0] ?? ''''if (!streamId) return ''zero events replayed. New map-lookup getStreamIdForEventId added directly above.

Step-by-step proof (middleware fixture)

  1. Standalone GET stream id is '_GET_stream:abc'. Server stores a notification → event id '_GET_stream:abc_8f3e…' in storedEvents. A POST stream has also stored '<uuidP>_<uuidE>'.
  2. Client reconnects with Last-Event-ID: _GET_stream:abc_8f3e….
  3. Transport calls getStreamIdForEventId'_GET_stream:abc_8f3e…'.lastIndexOf('_') → slice → '_GET_stream:abc'. Ownership check passes.
  4. Transport calls replayEventsAfter('_GET_stream:abc_8f3e…', …).
  5. Line 1394: '_GET_stream:abc_8f3e…'.split('_')['', 'GET', 'stream:abc', '8f3e…']; [0]''.
  6. Loop: eventId.startsWith('') is true for every key in storedEvents, so the POST-stream event '<uuidP>_<uuidE>' is also sent on the GET replay.

For the example/integration stores, step 5 yields '' and the function early-returns '' before the loop, so nothing is replayed even though getStreamIdForEventId (and the transport) believe the resume is valid.

Why nothing catches it

After commit 4b8b389 the transport ignores replayEventsAfter's return value and registers the connection under the verified streamId, so the mapping is correct regardless. The remaining effect is purely the fixture's event filtering. The existing standalone-GET replay test in the middleware suite uses toContain assertions and so doesn't notice the extra over-replayed events; the integration test resumes a POST stream (UUID without leading _), so split('_')[0] happens to work there. CI is green.

The split('_')[0]'' problem for standalone-GET is technically pre-existing (the old constant '_GET_stream' also started with _), but this PR (a) touches all three files, (b) adds the correct parser directly adjacent to the broken one in each, and (c) is specifically about replay correctness — so it's the natural place to align them. This is the REVIEW.md → Recurring Catches → Completeness pattern: "When a PR replaces a pattern, grep for surviving instances of the old form — partial migrations leave sibling code paths with the very bug the PR claims to fix."

Relationship to the resolved replayedStreamId comment

Not a duplicate. The earlier (now-resolved) comment was about the transport registering the new connection under replayEventsAfter's return value; the fix was to ignore that return value. It cited the middleware fixture's parser divergence as evidence of inconsistency but did not ask for a fixture change, and did not mention the example or integration files. This note is about the fixtures'/example's own replay filtering, which is still wrong after the transport fix — different remedy, broader scope.

Impact

Low — test/example code only, no production package affected, no CI failure. The examples/server/src/inMemoryEventStore.ts case is the most user-relevant since it's the reference implementation users copy (it carries a "not for production" disclaimer, but it's still the canonical example).

Fix

In each replayEventsAfter, derive the stream id the same way the new getStreamIdForEventId does:

// middleware test fixture (line 1394)
const sep = lastEventId.lastIndexOf('_');
const streamId = sep > 0 ? lastEventId.slice(0, sep) : '';

// example + integration stores: just look it up
const streamId = this.events.get(lastEventId)?.streamId;
if (!streamId) return '';

(or drop the now-unused private getStreamIdFromEventId in the example entirely).

Comment on lines +41 to 44
* Required for `Last-Event-ID` resumption: the SDK uses this to verify the requesting
* caller owns the stream BEFORE replaying. If omitted, replay returns 403 (fail-closed).
*/
getStreamIdForEventId?(eventId: EventId): Promise<StreamId | undefined>;
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟡 The JSDoc now says "Required for Last-Event-ID resumption … If omitted, replay returns 403" and the migration guide says it "is now required", yet the interface still declares getStreamIdForEventId? as optional — and after this PR replayEventsAfter (which is non-optional) is unreachable without it. Consider dropping the ? (every in-tree EventStore already implements it, so it's free in-repo and gives external implementers a compile-time signal instead of a runtime 403); if you're keeping the ? purely for source-compat in a patch release, a one-line note in the JSDoc saying so would resolve the doc/type mismatch.

Extended reasoning...

What's inconsistent

The EventStore interface still declares the method as optional:

getStreamIdForEventId?(eventId: EventId): Promise<StreamId | undefined>;

but everything else this PR adds says otherwise:

  • The JSDoc immediately above (lines 41-42): "Required for Last-Event-ID resumption: the SDK uses this to verify the requesting caller owns the stream BEFORE replaying. If omitted, replay returns 403 (fail-closed)."
  • docs/migration.md: "EventStore.getStreamIdForEventId(eventId) is now required for Last-Event-ID resumption."
  • replayEvents() (lines 517-522): if (!this._eventStore.getStreamIdForEventId) return ... 403.

So the type signature and the documented/runtime contract disagree: TypeScript will not flag an EventStore that omits the method, yet such an implementation hard-fails every replay at runtime.

Why the ? no longer represents a real optional capability

Before this PR the method was genuinely optional — the JSDoc said "If not provided, the SDK will use the streamId returned by replayEventsAfter". After this PR that fallback is gone: the 403 gate fires before replayEventsAfter is ever called, and replayEventsAfter's return value is now explicitly ignored (per the new comment at line 568). So:

  • replayEventsAfter is non-optional in the interface, but is now unreachable unless getStreamIdForEventId (typed optional) is present. The interface has a required method gated by an optional one.
  • The interface's own purpose statement (line 25: "Interface for resumability support via event storage") is exactly the thing this method is "required for". An EventStore without it isn't a degraded EventStore — it's strictly worse than passing no eventStore at all: storeEvent() is still called on every SSE message (memory/storage growth, event ids emitted to the client) but the stored events can never be replayed.

So the ? doesn't model "you can skip this if you don't need feature X"; it models a configuration that compiles cleanly and then 403s the only thing the interface exists to do.

Step-by-step proof

  1. A user implements EventStore from the interface, providing only the two non-optional members storeEvent and replayEventsAfter. TypeScript is happy; their IDE shows getStreamIdForEventId? as optional and gives no hint.
  2. They pass it to new WebStandardStreamableHTTPServerTransport({ eventStore }).
  3. Client opens a stream; server calls storeEvent() on every SSE write — events accumulate.
  4. Client disconnects and reconnects with Last-Event-ID: <id>.
  5. replayEvents() hits if (!this._eventStore.getStreamIdForEventId) → returns 403 "Forbidden: event store does not support session-scoped replay". replayEventsAfter — the method they were required by the type system to implement — is never invoked.
  6. Net: they've paid the storage cost of resumability and shipped an implementation the compiler accepted, but resumption never works.

Why nothing prevents it

There's no runtime warning at construction time (the 403 only fires on a replay attempt), and no compile-time warning because the ? suppresses it. The JSDoc does document the consequence clearly, so a careful reader is fine — but the point of the type is to not rely on that.

Impact

Low — this is an API-surface consistency point, not a correctness bug in the transport. The runtime error message is explicit ("getStreamIdForEventId required"), and there's a defensible reason to keep the ? (avoid a TS compile break for external implementers in a patch release). But as-is the interface is internally inconsistent (required method gated by optional method) and contradicts both its own JSDoc and the migration guide. Hence nit.

Distinct from existing comments

This is not a duplicate of inline-comment #3226621736, which is about the missing changeset / reconciling the PR description's "Breaking Changes: None" — i.e. release notes. This comment is about the type signature at line 44 vs. the JSDoc/runtime contract. Also distinct from #3226621710 (migration.md placement / migration-SKILL.md parity).

Fix

Either:

  • (a) Drop the ? and make it non-optional. The PR already updates all five in-tree EventStore implementations (examples/server/src/inMemoryEventStore.ts, packages/middleware/node/test/streamableHttp.test.ts, test/conformance/src/everythingServer.ts, test/integration/test/taskResumability.test.ts, plus any others), so this is free inside the repo and matches the migration-guide framing. External implementers get a compile error pointing at exactly the method they need to add, instead of a runtime 403.
  • (b) Keep the ? for source-compat, but add one sentence to the JSDoc making that explicit, e.g. "Typed optional for backward source-compatibility only; functionally required — see above." That at least removes the appearance of an oversight.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

v2-stateless 2026-06 SDK: Protocol decomposition + SEP alignment (request-first / stateless)

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant